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ABSTRACT 

We combine data from a number of N-body simulations to predict the abundance 
of dark halos in Cold Dark Matter universes over more than 4 orders of magnitude in 
mass. A comparison of different simulations suggests that the dominant uncertainty 
in our results is systematic and is smaller than 10-30% at all masses, depending on 
the halo definition used. In particular, our "Hubble Volume" simulations of rCDM 
and ACDM cosmologies allow the abundance of massive clusters to be predicted with 
uncertainties well below those expected in all currently planned observational surveys. 
We show that for a range of CDM cosmologies and for a suitable halo definition, the 
simulated mass function is almost independent of epoch, of cosmological parameters, 
and of initial power spectrum when expressed in appropriate variables. This univer- 
sality is of exactly the kind predicted by the familiar Press-Schechter model, although 
this model predicts a mass function shape which differs from our numerical results, 
overestimating the abundance of "typical" halos and underestimating that of massive 
systems. 

Key words: cosmology:theory - dark matter - gravitation 



1 INTRODUCTION 



Accurate theoretical predictions for halo mass functions are 
needed for a number of reasons. For example, they are a 
primary input for modelling galaxy formation, since current 
theories assume galaxies to result from the condensation of 
gas in halo cores (White & Rees 1978 and many subse- 
quent papers; see Somerville & Primack 1999 for a recent 
overview) . The abundance of the most massive halos is sen- 
sitive to the overall amplitude of mass fluctuations while the 
evolution of this abundance is sensitive to the cosmological 
density parameter, Qq. As a result, identifying massive halos 
with rich galaxy clusters can provide an estimate both of the 
amplitude of the primordial density fluctuations and of the 
value of Qo', recent discussions include Henry (1997), Eke 
et al (1998), Bahcall & Fan (1998), Blanchard & Bartlett 
(1998) and the Virgo consortium presentation of the Hub- 
ble Volume simulations used below (Evrard et al 2000). The 
mass function is also a critical ingredient in the apparently 
strong constraints on cosmological parameters (principally 
Slo and A) which can be derived from the observed incidence 



of strong gravitational lensing (e.g. Bartelmann et al 1998; 
Falco, Kochanek & Munoz 1998). 

As the observational data relevant to these issues im- 
prove, the need for accurate theoretical predictions in- 
creases. By far the most widely used analytic formulae for 
halo mass functions are based on extensions of the theoret- 
ical framework first sketched by Press & Schechter (1974). 
Unfortunately, none of these derivations is sufficiently rigor- 
ous that the resulting formulae can be considered accurate 
beyond the regime where they have been tested against N- 
body simulations. In this paper, we combine mass functions 
from simulations of four popular versions of the cold dark 
matter (CDM) cosmology to obtain results valid over a wider 
mass range and to higher accuracy than has been possible 
before. These mass functions show a regularity which allows 
them all to be fitted by a single interpolation formula de- 
spite the wide range of epochs and masses they cover. This 
formula can be used to obtain accurate predictions for CDM 
models with parameters other than those we have simulated 
explicitly. 

Although the analytical framework of the Press- 
Schechter (P-S) model has been greatly refined and extended 
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in recent years, in particular to allow predictions for the 
merger histories of dark matter halos (Bond et al 1991, 
Bower 1991, Lacey & Cole 1993), it is well known that the 
P-S mass function, while qualitatively correct, disagrees in 
detail with the results of N-body simulations. Specifically, 
the P-S formula overestimates the abundance of halos near 
the characteristic mass M* and underestimates the abun- 
dance in the high mass tail (Efstathiou et al 1988, Efstathiou 
& Rees 1988, White, Efstathiou & Frenk 1993, Lacey & 
Cole 1994, Eke, Cole & Frenk 1996, Gross et al 1998, Gover- 
nato et al 1999). Recent work has studied whether this dis- 
crepancy can be resolved by replacing the spherical collapse 
model of the standard P-S analysis by ellipsoidal collapse 
(e.g. Monaco 1997ab, Lee & Shandarin 1998, Sheth, Mo & 
Tormen 1999). Sheth, Mo & Tormen (1999) were able to 
show that this replacement plausibly leads to a mass func- 
tion almost identical to that which Sheth & Tormen (1998) 
had earlier fitted to a subset of the numerical data we anal- 
yse below. Note that at present there is no good numerical 
test of analytic predictions for the low mass tail of the mass 
function, and our analysis in this paper does not remedy 
this situation. 

Several authors have considered halo mass functions in 
models with scale-free Gaussian initial fluctuations. The at- 
traction is that clustering should be self-similar in time in 
such models when = 1. This expectation is easy to test 
numerically and appears substantiated by the available sim- 
ulation data (Efstathiou et al 1988, Lacey & Cole 1994). De- 
viations from self-similarity can be used to isolate numerical 
artifacts which break the scaling. The CDM power spectrum 
is not scale-free although the variation of the effective spec- 
tral index with wavenumber is gentle. Recently, a number of 
very large CDM simulations have been performed by Gross 
et al (1998) and Governato et al (1999). These confirmed 
the deviations from the P-S prediction found in earlier work 
and extended coverage to both higher and lower masses. 
An interesting question is how the non-power-law nature of 
the fluctuation spectrum affects the mass function; P-S the- 
ory predicts there to be no effect when the mass function 
is expressed in suitable variables. However, Governato et al 
(1999) find evidence that the high- mass deviation from the 
P-S prediction increases with increasing redshift, although 
the effect is small. 

The essence of group finding is to convert a discrete 
representation of a continuous density field into a countable 
set of "objects". In general, this conversion is affected both 
by the degree of discreteness in the realisation (the particle 
mass) and by the detailed characteristics of the object defini- 
tion algorithm. The definition of object boundaries is some- 
what arbitrary, and one can expect the characteristics of 
the object set to vary according to the specific assumptions 
adopted. This introduces uncertainties when comparing sim- 
ulations and analytic models. We examine here two of the 
standard algorithms used in earlier literature, the friends- 
of-friends and spherical overdensity group-finders (Davis et 
al 1985, Lacey & Cole 1994). Motivated by spherical col- 
lapse models, both attempt to identify virialized regions that 
are overdense by a factor ~ 200 with respect to the global 
mean. Some comparisons of these algorithms have already 
been published by Lacey & Cole (1994) who found small but 
measurable differences. 

In reality, halos possess a variety of physical characteris- 



tics that can be employed by observers to form new, and pos- 
sibly different, countable sets. For example, the Coma clus- 
ter is a single object to an X-ray observer but contains hun- 
dreds of visible galaxies. Similarly, very high resolution sim- 
ulations reveal a myriad of smaller, self-bound "sub-halos" 
within the virial region of each parent halo (e.g. Moore et al 
1999, Klypin et al 1999). The distinction between the larger 
halo and its substructure is largely one of density; the sub- 
halos are bounded at a much higher density than the parent 
object. Such subtleties further complicate comparisons be- 
tween theory, experiment and observation, and in all such 
analysis it is important to take careful account of the specific 
algorithms used to define the objects considered. 

In Section ^, we describe the simulations and the two 
halo finders that we employ when determining halo mass 
functions. In Section ^, we investigate the consistency of 
the mass functions derived from different simulations of the 
rCDM and ACDM cosmologies. In Section^, we present our 
results for the rCDM and ACDM models and compare them 
with P-S theory and with the Sheth- Tormen fitting formula. 
In Section ^, we examine the evolution of the high mass end 
of the mass function with redshift. Using the insight that this 
provides, we generalize our results and show that if a single 
linking length is used to define halos, then a single fitting 
formula, very close to that proposed by Sheth & Tormen, 
accurately describes the mass functions in rCDM ACDM 
SCDM and OCDM models over a wide range of redshifts. 
We present and discuss our conclusions in Section |(| where 
we also explain how to obtain some of the simulation data 
and the analysis software used in our study. Appendix A 
tests the influence of numerical parameters on our measured 
mass functions, while appendix B gives "best fit" analytic 
representations of a number of our mass functions. 



2 SIMULATION DETAILS 
2.1 The simulations 

For the main analysis of this paper we measure halo mass 
functions in a number of N-body simulations carried out by 
the Virgo consortium for two cosmological models, rCDM 
and ACDM, as introduced by Jenkins et al (1998). The sim- 
ulation parameters are listed in Table 1. The largest cal- 
culations are the two "Hubble volume" simulations, each 
with 10 9 particles in boxes of volume 8 and 27/i _3 Gpc 3 ^| re- 
spectively. These simulations allow the mass function to be 
determined for very massive clusters (> 10 15 /i _1 Mq) with 
relatively small Poisson errors. In addition, we have anal- 
ysed several simulations of smaller volumes, but with better 
mass resolution, from which the mass function can be deter- 
mined reliably down to masses of a few times 10 Mq. In 
section 5 we also include data from the large SCDM simu- 
lations carried out by Governato et al (1999) and from the 
simulations of SCDM and OCDM discussed in Jenkins et al 
(1998). 

The models listed in Table 1 are all normalised so as 
to be consistent with the observed local abundance of rich 
galaxy clusters (White, Efstathiou & Frenk 1993, Eke, Cole 

* We express the Hubble constant as Hq = 100/ikms — 1 Mpc~ 1 . 
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Table 1. N-body simulation parameters. 
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& Frenk 1996, Viana & Liddle 1996) and are also consis- 
tent with the standard COBE normalization (e.g. Ratra et 
al 1997). However, for the most part, the precise normali- 
sation is unimportant for the purposes of this paper. The 
power spectrum of the initial conditions of all the simula- 
tions except ACDM-hub & ACDM-512 was set up using the 
transfer function given by Bond & Efstathiou (1984), 



T{k) = 



1 + \aq + (fog) 3 / 2 + (cq) 2 ] 



(1) 



where q = k/T, a = 6.4/i _1 Mpc, b = 3/i _1 Mpc, c = 
1.7/i _1 Mpc and v = 1.13. For the ACDM-hub and ACDM- 
512 simulations, the transfer function was computed using 
CMBFAST (Seljak & Zaldarriaga 1996), assuming h = 0.7 
and fibaryon/i 2 = 0.0196 (Buries and Tytler 1998). In all 
models, the slope of the primordial power spectrum was 
taken to be unity. Full details of how the simulations were 
set up are given in Jenkins et al (1998), for simulations end- 
ing in -gif -int -virgo and -test, and in Evrard et al (2000), 
for simulations ending in -hub. More recently, we have com- 
pleted 512 3 -particle simulations of the rCDM and ACDM 
models using essentially the same code that we used for the 
Hubble volume calculations. 

Each simulation yields a determination of the mass 
function over a mass range dictated by the particle mass 
and the number of particles in the simulation. To aid under- 
standing, we have performed a set of smaller test simulations 
(see bottom of Table 1), designed to investigate the sensi- 
tivity of the mass function to changes in a single numerical 
parameter at a time. The parameters that we vary are the 
particle mass, the gravitational softening and the starting 
redshift. These checks (discussed in appendix A) also give 
an indication of the number of particles required to deter- 
mine the mass function satisfactorily using different halo 
finders. 



fixed shape on the halos. A disadvantage is that it may occa- 
sionally link two halos accidentally through a chance bridge 
of particles. In the limit of very large numbers of particles 
per object, FOF approximately selects the matter enclosed 
by an isodensity contour at 1/6 3 . 

In the SO algorithm, the mass of a halo is evaluated 
in a spherical region. There is one free parameter, the mean 
overdensity, n, of the halos. There are many possible ways of 
centering the spherical region. In our case, the centre is de- 
termined iteratively, after making an initial guess based on 
an estimate of the local density for each particle, re-centering 
on the centre-of-mass, growing a sphere outwards about the 
new centre until it reaches the desired mean overdensity, and 
recomputing the centre-of-mass. After several iterations, the 
motion of the centre becomes small. An alternative method 
consists of centering the sphere on the local maximum en- 
closed mass at fixed overdensity, but this is more compu- 
tationally intensive - a strong disincentive when identifying 
SO groups in the Hubble volume simulations. 

The conventional choices for Q. = 1 cosmologies are 
FOF(6 = 0.2) and SO(k=180) (Davis et al 1985; Lacey & 
Cole 1994). For models where Q ^ 1, the choice is less obvi- 
ous. At late times, groups stop growing and the appropriate 
linking length or bounding density should plausibly become 
constant in physical coordinates. At early times Q ~ 1, and 
it is the corresponding comoving quantities that are most 
naturally kept fixed. One needs to decide how to make the 
transition between these two regimes. Lacey & Cole (1994) 
and Eke, Cole & Frenk (1996) have done this using the 
spherical top-hat collapse model. We adopt their approach 
for the analysis in Section 4, taking FOF(b = 0.164) and 
SO(k = 324) for ACDM at z = 0. This choice cannot be rig- 
orously justified, however, and we find in Section 5 that sim- 
pler results are obtained by using the conventional Einstein- 
de Sitter parameters also for this cosmology and for OCDM. 



2.2 Halo finders 

We use two different algorithms to identify dark matter ha- 
los: the friends-of-friends (FOF) algorithm of Davis et al 
(1985), and the spherical overdensity (SO) finder described 
by Lacey & Cole (1994). The FOF halo finder depends on 
just one parameter, b, which defines the linking length as 
6n _1//3 where n is the mean particle density. An attractive 
feature of the FOF method is that it does not impose any 



3 CONSISTENCY CHECKS 

It is important to check the reliability of mass functions de- 
termined from simulations. From a formal point of view this 
is impossible, since there is no known analytic solution for 
any realistic halo definition. It is, however, possible to check 
for consistency between different simulations and different 
halo definitions. The mass ranges covered by the different 
simulations in each of our sets overlap considerably. As we 
show below, the agreement in these overlap regions is far 
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Figure 1. Friends-of-fricnds differential mass functions for dark 
matter halos in the tCDM and ACDM simulations. Halos were 
identified using linking lengths of 0.2 and 0.164 respectively. The 
different curves correspond to the various simulations detailed in 
Table 1. The mass resolution in the simulations varies by more 
than two orders of magnitude and the volume surveyed by more 
than four orders of magnitude. In all cases, the mass functions 
are truncated at the low mass end at the mass corresponding to 
20 particles, and at the high mass end at the point where the 
predicted Poisson abundance errors reach 10%. The simulations 
match up very well. 



from perfect. One source of difference is just the Poisson er- 
ror due to the finite number of clusters in each simulation. In 
the following we will only use numerical data for which such 
Poisson errors are below 10% and are negligible compared to 
systematic errors. These, we attribute to weak dependences 
of the measured mass functions on some of the numerical pa- 
rameters of our simulations. Such systematics appear to be 
smaller than about 10% for the FOF halo-finder and about 
30% for the SO halo-finder, and are therefore too small to 
be a major concern. It seems unlikely that observational de- 
terminations of halo mass functions will reduce systematic 
uncertainties to such small levels in the foreseeable future. 

Defining n(M) to be the number of halos with mass less 
than M, we plot the differential mass functions, dn/dlnM 
for all rCDM simulations in Table 1 with as ~ 0.6 and for all 
ACDM simulations with erg = 0.9, excluding the last three 
test simulations. The mass functions for the test simulations 



Figure 2. Spherical overdensity differential mass function for 
dark halos in the rCDM and ACDM simulations. Halo masses 
were defined at mean interior overdensities of 180 and 324 respec- 
tively. The different curves correspond to the various simulations 
detailed in Table 1 and are truncated as in Fig.l. The simulations 
do not match up as well here as in Fig.l. Simulations with coarse 
mass resolution seem to underestimate the halo abundance near 
their lower mass limit. See the text for further details. 



are plotted in Appendix A and compared with other simu- 
lations to show how much the mass function can vary as a 
result of changes in the particle mass, gravitational softening 
and starting redshift. 

The FOF mass functions for our two cosmologies are 
displayed in Fig. 1. The numerical data have been smoothed 
with a kernel which is gaussian in log 10 (M) with RMS 
width 0.08. This smoothing erases high frequency Pois- 
son noise and provides a continuous curve for compari- 
son with analytic predictions. Using a 'bin' shape without 
sharp edges also reduces fluctuations at the low mass end 
where the halo masses are integer multiples of the parti- 
cle mass. For a power-law mass function, F = dn/dM oc 
M~ a , such smoothing raises the amplitude by approxi- 
mately exp(a 2 /59). For the rCDM mass function, this fac- 
tor is about 1.03 at lO 14 /i -1 Af , 1.14 at 10 15 /i _1 M Q , and 
1.7 at the highest masses we plot; similar numbers apply to 
ACDM. Poisson statistics lead to RMS uncertainties in the 
smoothed curves which are 
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5F R 



V sin 



dn 



-1/2 



exp(— a /79), 



(2) 



where Vsim is the volume of the simulation considered and 
errors are correlated over a distance comparable to the 
smoothing length-scale. As noted above, we only plot curves 
for which this uncertainty is below 10% and is small com- 
pared to other sources of error. 

As can be seen in Fig. 1, FOF mass functions for differ- 
ent simulations match well even when the number of parti- 
cles per halo is small. The linking parameter was taken to 
be b = 0.2 for rCDM and b = 0.164 for ACDM. For the 
smaller boxes, Poisson fluctuations are clearly visible at the 
high mass end of the plotted curves. Such fluctuations are 
much less pronounced in the curves derived from the Hubble 
volume simulations, because the mass function is then much 
steeper at the point where discreteness noise becomes appre- 
ciable and our smoothing erases features more efficiently. 

Fig. 2 shows a similar comparison of mass functions ob- 
tained using the SO halo-finder. The mean overdensity was 
set to 180 and 324 in the rCDM and ACDM cases respec- 
tively. Here we see a systematic effect: the abundance at 
given halo mass in a simulation with large particle mass is 
always lower than that found in a simulation with smaller 
particle mass. The difference is particularly pronounced be- 
tween the Hubble volume simulations and the others, but 
this merely continues a trend of increasing discrepancy as 
the halo mass at the matching point becomes larger. The 
tests carried out in Appendix A show that these mismatches 
result from resolution effects on halo structure which partic- 
ularly affect halos identified with the SO algorithm. Robust 
results require a minimum of about 100 particles per SO 
halo, but only a minimum of about 20 particles per FOF 
halo. 



4 COMPARISON WITH ANALYTIC MODELS 

In this section, we compare some popular analytic models 
to the mass functions constructed above. It proves conve- 
nient to use the quantity lncr (M, z) as the mass variable 
instead of M, where a 2 (M, z) is the variance of the linear 
density field, extrapolated to the redshift z at which halos 
are identified, after smoothing with a spherical top-hat filter 
which encloses mass M in the mean. This variance can be 
expressed in terms of the power spectrum P(k) of the linear 
density field extrapolated to redshift zero as: 



&!(£) 
2tt 2 



k 2 P(k)W 2 (k;M)dk, 



(3) 



where b(z) is the growth factor of linear perturbations nor- 
malised so that b = 1 at z = 0, and W(k; M) is the Fourier- 
space representation of a real-space top-hat filter enclosing 
mass M at the mean density of the universe. 

We define the mass function /(<r, z; X) through: 



M dn x (M, 



Po 



d In a 



(4) 



where A is a label identifying the cosmological model and 
halo finder under consideration, n(M, z) is the abundance of 
halos with mass less than M at redshift z, and po{z) is the 
mean density of the universe at that time. 



The advantage of using In a -1 as the mass variable is 
most evident when we consider the analytic models to which 
we compare. As we will see below, the Press-Schechter model 
predicts a simple analytic form for f{o, z) which has no ex- 
plicit dependence on redshift, power spectrum or cosmolog- 
ical parameters; a single mass function describes structure 
in all gaussian hierarchical clustering models at all times 
in any cosmology provided abundances are plotted in the 
/ — \sx(5 c /a) plane, where 5 C is a threshold parameter, possi- 
bly dependent on Q, which we discuss later. This very simple 
structure carries over to extensions of the Press-Schechter 
analysis such as that presented by Sheth, Mo & Tormen 
(1999). For a power-law linear power spectrum and Q = 1, 
clustering is expected to be self-similar in time on general 
grounds independent of the P-S model (e.g. Efstathiou et al 
1988, Lacey & Cole 1994), implying again that mass func- 
tions at different times should map onto a unique curve in 
the / — In a" 1 plane (although this curve could be a func- 
tion of the power-law spectral index). For CDM power spec- 
tra, the spectral index varies quite weakly with scale so one 
might expect at most a weak dependence on redshift in this 
plane. At worst, use of In a~ 1 (M) as the "mass" variable 
"factors out" most of the difference in the mass functions be- 
tween different epochs, cosmologies and power spectra, and 
so allows a wider comparison, both among our own simula- 
tions and between these and those by other authors. 

In Subsection 4.1, we describe the two analytic models 
with which we compare explicitly to our numerical results 
in subsection 4.2. 



4.1 Analytic models 

The Press-Schechter model (e.g. Press & Schechter 1974, 
Bond et al. 1991, Lacey & Cole 1993) predicts a mass- 
function given by: 



/to P-S) 



2 5 C 



■ exp 



2a 2 



(5) 



where S c is a threshold parameter usually taken to be the 
extrapolated linear overdensity of a spherical perturbation 
at the time it collapses. In an Einstein-de Sitter cosmology 
S c = 1.686. In other cosmologies S c is sometimes assumed 
to be a weak function of fi and A (see e.g. Eke, Cole & 
Frenk 1996). The P-S mass function has the normalisation 
property: 



f(a; P-S)d In a' 1 = 1. 



(6) 



This implies that all of the dark matter is attached to halos 
of some mass. 

Sheth & Tormen (1999, hereafter S-T) have introduced 
a new formula for the mass function (see their eqn. 10) 
which, for empirically determined choices of two parameters, 
gives a good fit to the mass functions measured in a subset 
of the N-body simulations analysed in this paper (the simu- 
lations ending in -gif in Tables 1 and 2) . Their mass function 
can be expressed as: 



/to S-T) 




1 + 
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a5 2 


— exp 
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2a 2 



(7) 
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where A=0.3222, a = 0.707 and p = 0.3. Sheth, Mo and 
Tormen (1999, SMT) extended the excursion set derivation 
of the P-S formula developed by Bond et al (1991) to in- 
clude an approximate treatment of ellipsoidal collapse, and 
showed this to produce a mass function almost identical in 
shape to eqn. ([?]). Their derivation forces this mass function 
to obey the integral relation of eqn. (^). Below, we com- 
pare their mass function with our N-body results, including 
mass scales, redshifts and cosmologies other than those it 
was originally forced to fit. In a separate paper (Colberg et 
al 2000), we compare the clustering of halos in our Hubble 
Volume simulations with the predictions of the SMT model. 

Other analytic models for the halo mass function have 
been proposed recently by Monaco ( 1997a, b) and Lee & 
Shandarin (1998). These are substantially poorer fits to our 
data than the Sheth, Mo & Tormen (1999) model and we 
do not consider them further in this paper. Other published 
discussions of "Press- Schechter" predictions have made use 
of filter functions other than the spherical top hat, or have 
treated the threshold S c as an adjustable parameter. We will 
not pursue the first of these possibilities at all, and we only 
deviate from standard assumptions about S c in Section (JBJ). 
There we show that taking 8 C — 1.686 in all cosmologies 
leads to excellent agreement with our numerical data if ha- 
los are defined at fixed overdensity independent of Q, rather 
than at an ^-dependent overdensity as in the more conven- 
tional approach. 



4.2 Comparison to simulated mass functions 

In order to make proper comparisons between analytic mod- 
els and our simulated halo mass functions, we smooth the 
analytic predictions in the same way as we smoothed the 
simulation results in Section 3. In practice, this smoothing 
changes the predictions very little; the difference is only per- 
ceptible at high masses where the curves are very steep. 

Interpolation formulae that accurately represent our 
(unsmoothed) mass functions are given in Appendix B and 
plotted (after smoothing) in Figs. 3 and 4. Fig. 3 com- 
pares the FOF(0.2) halo mass function in the rCDM simu- 
lations with the P-S and S-T formulae assuming 5 C = 1.686. 
At the high mass end (lncr^ 1 > 0), the simulation re- 
sults lie well above the P-S prediction. This discrepancy 
has been observed by a large number of authors (e.g. Ef- 
stathiou et al 1988, White, Efstathiou & Frenk 1993, Gross 
et al 1998, Governato et al 1999). Our simulations confirm 
that the divergence increases at even larger halo masses than 
those accessible to previous simulators. For lncr -1 < 0.3, 
the P-S curve overestimates the simulated mass functions, 
and at the characteristic mass M* (where a = S c and so 
lna -1 = —0.52), the halo abundance is only 60% of the 
P-S prediction. This conclusion agrees with the results of 
Efstathiou et al (1988) and Gross et al (1998) who found a 
similar discrepancy for a number of different cosmological 
models. Adjusting the simulated mass functions by altering 
halo-finder parameters, or the analytic predictions by alter- 
ing 8 C , tends to shift the relevant curves in the In / — ln<7 _1 
plane without much altering their shape. As a result, it does 
not significantly improve the overall agreement between the 
P-S model and the numerical data. 

By contrast, the S-T mass functions give an excellent 
fit to the N-body results in Fig. 3. Good agreement is to be 




Figure 3. A comparison of analytic models with the halo mass 
function at z = in our N-body simulations of the tCDM cosmol- 
ogy. Halos were found using the FOF algorithm with b = 0.2. The 
short dashed lines show results from the individual tCDM simu- 
lations used in this paper, the solid curve is the fit of eqn. ( |Bl[ ) 
to the combined results of the simulations, while the dashed line 
shows the P-S prediction and the dotted line, the S-T prediction, 
both using <5 C = 1.686. The arrow marks the characteristic mass 
scale, M», where <r(M*) = 8 C , and corresponds to the position of 
the peak in the Press-Schechter mass multiplicity function. Note 
that we use natural logarithms in this plot. 



expected at the low mass end since a subset of our simula- 
tion data was used by Sheth & Tormen (1999) to determine 
the parameters of their fitting function. This fitting function 
matches our numerical data for rCDM-FOF over their en- 
tire range, including large masses which were not considered 
in the original fit. 

Fig. 4 shows the analogous comparison in the case of the 
ACDM model. Here, the FOF(0.164) halo mass function in 
the simulations is compared to the P-S and S-T predictions 
using 8 C — 1.675 as advocated by Eke et al (1996) for this 
cosmology. With these choices of parameters, the P-S curve 
gives a better fit to the simulated mass function at high 
masses than in the rCDM case, although it still underesti- 
mates the abundance of high mass clusters and substantially 
overestimates the abundance near M*. The S-T model is a 
poorer fit to the numerical data than for rCDM , substan- 
tially overestimating abundances at high masses. However, 
as we will show in the next section, this disagreement is all 
but removed by different (and simpler) assumptions about 
the appropriate parameters for halo-finders when Q < 1. 

All the above comparisons refer to halos identified us- 
ing the FOF halo-finder with standard parameters. We have 
checked that very similar results are obtained if halos are 
identified using the SO halo-finder, again with standard pa- 
rameters. This similarity was previously noted by Tormen 
(1998). 

It is interesting to compute the fraction of the to- 
tal cosmic mass density which is included in halos over 
the full range of validity of our simulation mass functions 
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Figure 4. A comparison of analytic models with the halo mass 
function at z = in our N-body simulations of the ACDM cosmol- 
ogy. Halos were found using the FOF algorithm with b = 0.164. 
The short dashed lines show results from the individual ACDM 
simul atio ns used in this paper, the solid curve is the fit of 
eqn. ( |B2| ) to the combined results of the simulations, while the 
long dashed line shows the P-S prediction and the dotted line, the 
S-T prediction, both using <5 C = 1.675. The arrow marks the char- 
acteristic mass scale, M*, where ct(M») = <5 C , and corresponds to 
the position of the peak in the Press-Schechter mass function. 
Note that we use natural logarithms in this plot. 



(~3x 10 11 to ~ 5 x 10 15 /i _1 M Q ). Using our fits to the 
FOF data in Figs 3 and 4, we find this fraction to be 0.37 
both in rCDM and in ACDM . The corresponding fraction 
for the P-S mass function is 0.50. Where is the remaining 
mass? Clearly, higher resolution simulations will show some 
proportion to be in halos too small to have been resolved 
by our current simulations. However, there is no guarantee 
that such higher resolution simulations will produce a total 
halo mass fraction which converges to unity. Much of the 
dark matter may not lie in halos identified by an FOF(0.2) 
(or other) halo-finder, but rather in a smooth, low-density 
component, perhaps in extensions of the identified halos be- 
yond their artificial b = 0.2 boundaries. This possibility is 
suggested by the fact that our simulated mass functions re- 
main low compared to the P-S curve at the smallest masses 
for which we can measure them. A straightforward extrap- 
olation of our FOF(0.2) curves contains only ~ 70% of the 
total mass. Unfortunately, this extrapolation is not unique 
as shown by the fact that the S-T mass function can fit all 
our numerical data and yet is normalised to give a total halo 
mass fraction of unity. More numerical data are needed to 
study the low mass behaviour of the mass function in order 
to resolve this issue. 



5 TOWARDS A GENERAL FITTING 
FORMULA 

In this section, we first show that, when expressed in terms 
of In a~ , the mass functions in our two cosmological mod- 
els vary only very little with redshift, or equivalently, with 
the effective slope of the power spectrum. With our cur- 
rent definition of halos, however, the mass functions in the 
rCDM and ACDM models are different. In subsection 5.2, 
we show that this difference all but disappears if, instead of 
using a FOF linking length that varies with Q, we simply 
identify clusters with a constant linking length, b = 0.2, in 
both cosmologies. A general fitting formula can then accu- 
rately describe the halo mass function in a wide range of 
cosmological models. 



5.1 Comparison of the mass function at different 
redshifts 

For a scale-free power spectrum, the mass function expressed 
in terms of In cr _1 should be independent of redshift. Any dif- 
ferences must be due to Poisson sampling or to systematic 
errors introduced by numerical inaccuracies that break the 
scaling laws. For a CDM spectrum, for which the spectral 
slope is a function of scale, there could, in principle, be gen- 
uine evolution of the mass function but, if the amount of 
evolution is small, it may be masked by numerical effects. 

In the tCDM model, a suitable rescaling of the length 
and mass units allows the mass function at non-zero red- 
shift to be regarded as the redshift zero mass function of 
a simulation with a different power spectrum but identical 
normalisation. Differences in the / — In cr — 1 plane between 
mass functions at different redshifts can therefore indicate a 
dependency of the mass function either on the shape of the 
power spectrum or on numerical effects. In order to exploit 
this regularity, and also to compare our results with those 
of Governato et al (1999), we parametrise each simulation 
output by an effective power spectral slope n e fi, 



, diner 
'dlnM 



(8) 



where we evaluate the derivative at the point where a = 0.5, 
corresponding roughly to cluster scales for Q — 1 and = 0. 
For a scale-free power spectrum, n e ff is the power-law in- 
dex in P(k) oc k". We have calculated the FOF(0.2) mass 
function for the rCDM-hub simulation at redshifts z = 
0.0, 0.18, 0.44 and 0.78, for which n cff = -1.39, -1.48, -1.58 
and -1.70 respectively. We can extend this range of spectral 
slopes by considering, in addition, the mass functions de- 
termined by Governato et al (1999). Four outputs of their 
SCDM model correspond to cr 8 = 1.0,0.7,0.47 and 0.35, 
for which n e fi = —0.71, —0.90, —1.12 and -1.28. respectively. 
Thus, the two sets of simulations cover the range -0.71 to 
-1.7 in spectral slope but do not overlap. 

Fig. 5 shows the mass functions determined from these 
two simulation sets. We stress that the present comparison 
supersedes the preliminary comparison presented by Gover- 
nato et al. The agreement between the various mass func- 
tions is good, with a variation of only 30% over the range 
in n e fi from —0.7 to —1.7. On closer inspection, the curves 
from the rCDM-hub simulation form a sequence in which 
the mass function decreases with increasing redshift (or with 
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ln^" 1 ) 

Figure 5. A comparison of mass functions in different simulations 
at different epochs. The full curves show the FOF(0.2) mass func- 
tions for the rCDM-hub simulation at redshifts z = 0,0.18,0.44 
and 0.78. The dashed curves show the corresponding functions for 
the SCDM simulations of Governato et al (1999) at four epochs for 
which erg =1.0,0.7,0.47 and 0.35 respectively. The heavy dashed 
curve shows the P-S model function. Both simulation datasets 
show a weak trend with erg, but the trends are opposite! See the 
text for discussion. Note that we use natural logarithms in this 
plot. 

decreasing n e ff) while the curves from Governato et al (1999) 
also vary systematically, but in the opposite direction! The 
reason for these differing weak trends is unclear. They could 
reflect the differences in power spectrum shape between the 
two simulations, or perhaps small systematic numerical er- 
rors in one or both of them. (Note that since the Hubble 
volume simulation follows a much larger volume than the 
simulation of Governato et al, its mass function is much 
better determined for large masses.) However, the magni- 
tude of these trends (a total range below 30%) is sufficiently 
small as to be of no real interest. 

Fig. 6 shows mass functions at various epochs in the 
ACDM model. Here the curves are affected not only by the 
variation of spectral shape with redshift (as for rCDM) but 
also by the variation in the linking length which defines the 
simulated halos. For this plot we chose to follow the relation 
proposed by Eke et al (1996). The z = mass function 
(the lowest solid curve in the figure) is then significantly 
below the 2 = rCDM mass function (the light dashed 
curve). Furthermore, in contrast to rCDM, the ACDM mass 
function initially moves upwards in the / — ln<r _1 plane with 
increasing redshift. This can be attributed in large part to 
the increasing linking length (see below) . The upward trend 
reverses between z = 0.96 and z = 1.44, perhaps reflecting 
the rapid convergence of Q and 6/6 to the rCDM values. As 
in the rCDM case, the differences are all rather small. We 
conclude that although with the halo definition used in this 
section, the rCDM and ACDM mass functions are slightly 
different, there is no evidence for a significant systematic 
variation in the high mass end of the mass function with 
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Figure 6. A comparison of mass functions in different simulations 
at different epochs. The full curves show the FOF mass function 
for the ACDM-hub simulation at redshifts z = 0, 0.27, 0.96 and 
1.44. The first three form a sequence going from bottom to top, 
while the z = 1.44 output is slightly lower than z = 0.96. The 
light dashed curve shows the rCDM-hub (z = 0) mass function. 
The heavy dashed curve shows the P-S theory model function. 
The trend with redshift is in the opposite direction to that in 
the tCDM model except at the highest redshift where the trend 
appears to reverse. The initial trend reflects the varying linking 
length used to define the halos. For rCDM this choice was inde- 
pendent of redshift. Note that we use natural logarithms in this 
plot. 

redshift, with power spectrum slope n e «, or with Q, once it 
is transformed to the appropriate variables. 

5.2 A general fitting formula 

The results of the last section suggest that it may be possi- 
ble to find a universal formula which provides a reasonably 
accurate description of the halo mass function over a wide 
range of redshifts and in a wide range of cosmologies. Here, 
we show that a formula very similar to that suggested by S-T 
is indeed successful if halos are defined at the same overden- 
sity at all times and in all cosmologies independent ofQ. We 
concentrate on halos defined using FOF(0.2), but have found 
very similar results using SO(180). Table 2 lists the simu- 
lation outputs used in this section. These include the data 
already analysed from our own simulations and those of Gov- 
ernato et al , together with additional data from the SCDM 
and OCDM (open with firj = 0.3) simulations in Jenkins et 
al (1998). Because of the extended range of power spectral 
shapes involved, we 'deconvolve' the smoothed mass func- 
tions by multiplying by exp(— a 2 /59), where a is the local 
slope of log 10 (dn/d log 10 M) (see Section 3). 

Fig. 7 shows all the data plotted on the / — lncr -1 plane. 
As before, all our curves are truncated at the mass corre- 
sponding to 20 particles (50 particles for the Governato et al 
data) and where the Poisson error first exceeds 10%. These 
curves encompass a wide range not only in In cr _1 but also in 
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Figure 7. The FOF(0.2) mass functions of all the simulation out- 
puts listed in Table 2. Remarkably, when a single linking length 
is used to identify halos at all times and in all cosmologies, the 
mass function appears to be invariant in the / — lncr -1 plane. A 
single formula (eqn. ^) , shown with a dotted line, fits all the mass 
functions with an accuracy of better than about 20% over the 
entire range. The dashed curve show the Press-Schechter mass 
function for comparison. 



Figure 8. The residual between the fitting formula, eqn. and 
the FOF(0.2) mass functions for all the simulation outputs listed 
in Table 2. The lines are colour codes according to the value 
of n e ff- Solid lines correspond to simulations with = 1, short 
dashed lines to flat, low f2o models, and long dashed lines to open 
models. The heavy dashed line shows the Sheth-Tormen formula 
(equation (7)) 



effective power spectrum slope, n e s. Cosmic density ranges 
over 0.3 < f2 < 1.0. Remarkably, all curves lie very close 
to a single locus in the / — lncr -1 plane. The use of a con- 
stant linking length has significantly reduced the amplitude 
of the redshift trend seen in the ACDM model in the previ- 
ous section, and also places the OCDM outputs on the same 
locus. 

The numerical data in Fig. 7 are well fit by the following 
formula: 

f(M) = 0.315 exp [ - | liio- -1 + 0.61| 3 ' 8 ] , (9) 

valid over the range —1.2 < In cr _1 < 1.05. 

In Fig. 8 we plot the difference between the measured 
mass functions and our fitting formula. The fit is good to 
a fractional accuracy better than 20% for -1.2 < In a' 1 < 
1. This is a very significant improvement over the Press- 
Schechter formula which would exceed the vertical limits of 
the plot! The curves for the open models with SI = 0.3 are 
slightly high in this plot but only by ~ 10%. The spread 
between the different curves increases for large lncr - . This 
may simply reflect the fact that the very steep high mass end 
of the mass function is sensitive to numerical effects which 
change the masses of clusters in a systematic way. 

As shown in the figure, eqn. 9 is very close to the formula 
proposed by Sheth & Tormen (1999); there is a small dif- 
ference in the high mass tail, for lncr" 1 > 0.9. A non-linear 
least-squares fit of eqn. 7 to the simulation data in Fig. 8 
shows that the fit can be improved by adjusting the param- 
eters A, p and a. If the normalisation constraint, eqn. 6, is 
ignored, all three parameters can be allowed to vary freely. 
In this case, the best fit is obtained for A = 0.353, p = 0.175 



and a = 0.73 (and 0.84 of the mass is in halos). If the nor- 
malisation constraint is enforced, then only two parameters 
can vary; in this case the fit is not as good as that provided 
by eqn. 9. 

Fig. 9 shows the area of the In cr _1 — n e g parameter space 
which is occupied by the data in Fig. 8. The high mass end 
has good coverage in n e s with values up to -2.3. In prac- 
tice this means that for currently popular cosmologies, the 
high mass tail of the halo mass function is well determined 
at all redshifts where galaxies have so far been observed. 
The rCDM-gif simulation at z = 4.04 has n off = -2.26 
and agrees well with rCDM-hub which determine the high 
mass end of the mass function at more recent epochs. We 
have checked that the rCDM-gif z = 5 output, which has 
n e s = —2.35, is also consistent with our fitting function, 
although its Poisson errors are slightly too large to satisfy 
our 10% criterion for inclusion in Figs. 7-9. For low SI our 
fitting formulae should work to even higher redshift. Since 
fluctuations grow more slowly for low f2, and the value of 
as required to match current cluster abundances is higher, 
low density cosmologies predict substantially less negative 
values for n e g at each redshift. 



6 CONCLUSIONS 

We have derived halo mass functions at z — from sim- 
ulations of the rCDM and ACDM cosmologies over more 
than four orders of magnitude in mass, ~ 3 x 10 11 to 
~ 5 x 10 15 hT 1 Mq. In particular, our two Hubble volume 
simulations provide the best available predictions for the 
abundance of the most massive clusters. We have checked 
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Table 2. Parameters of N-body simulations used in Figs. 7, 8 and 9. Columns 2-5 give the cosmological 
parameters, and the normalisation of the power spectrum at the present epoch. Columns 6 and 7 give the 
number of particles and the size of the simulation cubes. Column 8 gives the redshift of the output at which 
FOF groups were found, and Column 9 gives the effective power spectrum slope at <r = 0.5 defined by 
equation pi 



Simulation 


Ho 


Ao 


r 


a-8 


A^part 


LboJh-iMpc 


z 


™eff 


rCDM-gif 


1.0 


0.0 


0.21 


0.60 


16 777 216 

-L U 1 1 1 £J _L U 


84.5 


0.00 


-1.39 


-rCDM-gif 


1.0 


0.0 


0.21 


0.60 


16 777 216 


84.5 


1.94 


-1.96 


rCDM-gif 


1.0 


0.0 


0.21 


0.60 


16 777 216 


84.5 


2.97 


-2.13 


-rCDM-gif 


1.0 


0.0 


0.21 


0.60 


16 777 216 

-LU 1 1 1 ilv 


84.5 


4.04 


-2.26 


i-CDM-int 

/ V_y_L/lVl lllLr 


1.0 


0.0 


0.21 


0.60 


1 6 777 21 6 

1_U 1 1 1 Zj ±\J 


160.3 


0.00 


-1.39 


TriDM-virm 

/ v_y_L/lVl V 11 tl W 


1.0 


0.0 


0.21 


0.60 


1 6 777 21 6 

1_U 1 1 1 ±\J 


239.5 


0.00 


-1.39 


tCDM-512 


1.0 


0.0 


0.21 


0.51 


134 217 728 


320.7 


0.00 


-1.48 


-rCDM-hub 


1.0 


0.0 


0.21 


0.60 


1 000 000 000 


2000.0 


0.00 


-1.39 


rCDM-hub 


1.0 


0.0 


0.21 


0.60 


1 000 000 000 

i UUU vjvjvj v / \ > \ r 


2000 

^vjvjvj . u 


0.18 


-1.48 


rCDM-hub 


1.0 


0.0 


0.21 


0.60 


1 000 000 000 

i uuu vjvjvj vjvjvj 


2000 

^vjvjvj . u 


0.44 


-1.58 


rCDM-hub 


1.0 


0.0 


0.21 


0.60 


1 000 000 000 

i uuu vjvjvj vjvjvj 


2000 

^VJVJVJ .VJ 


0.78 


-1.70 


SCDM-gif 


1.0 


0.0 


0.50 


0.60 


16 777 216 


84.5 


0.00 


-0.99 


SCDM-virgo 


1.0 


0.0 


0.50 


0.60 


16 777 216 


239.5 


0.00 


-1.08 


ACDM-gif 


0.3 


0.7 


0.21 


0.90 


16 777 216 


141.3 


0.00 


-1.17 


ACDM-gif 


0.3 


0.7 


0.21 


0.90 


16 777 216 


141.3 


0.52 


-1.32 


ACDM-gif 


0.3 


0.7 


0.21 


0.90 


16 777216 


141.3 


2.97 


-1.72 


ACDM-gif 


0.3 


0.7 


0.21 


0.90 


16 777 216 


141.3 


5.03 


-2.00 


ACDM-512 


1.0 


0.0 


0.21 


0.90 


134 217 728 


479.0 


0.00 


-1.25 


ACDM-512 


1.0 


0.0 


0.21 


0.90 


134 217 728 


479.0 


5.00 


-2.08 


ACDM-hub 


0.3 


0.7 


0.21 


0.90 


1000 000 000 


3000.0 


0.00 


-1.25 


ACDM-hub 


0.3 


0.7 


0.21 


0.90 


1000 000 000 


3000.0 


0.27 


-1.32 


ACDM-hub 


0.3 


0.7 


0.21 


0.90 


1000 000 000 


3000.0 


0.96 


-1.51 


ACDM-hub 


0.3 


0.7 


0.21 


0.90 


1000 000 000 


3000.0 


1.45 


-1.62 


OCDM-gif 


0.3 


0.0 


0.21 


0.85 


16 777 216 


141.3 


0.00 


-1.20 


OCDM-virgo 


0.3 


0.0 


0.21 


0.85 


16 777 216 


239.5 


0.00 


-1.20 


SCDM-gov 


1.0 


0.0 


0.5 


1.0 


46 656 000 


500.0 


0.00 


-0.71 


SCDM-gov 


1.0 


0.0 


0.5 


1.0 


46 656 000 


500.0 


0.43 


-0.90 


SCDM-gov 


1.0 


0.0 


0.5 


1.0 


46 656 000 


500.0 


1.13 


-1.12 


SCDM-gov 


1.0 


0.0 


0.5 


1.0 


46 656 000 


500.0 


1.85 


-1.28 



the sensitivity of our mass functions to choice of group- 
finder, to limiting overdensity, and to numerical parameters 
such as softening, particle mass and starting redshift (see ap- 
pendix A). Most dependences are weak. In particular, with a 
friends-of-friends group finder, the mass function is robustly 
determined with systematic uncertainties at or below the 
10% level for groups containing 20 particles or more. Some- 
what higher particle numbers are needed for reliable results 
with a spherical overdensity group-finder. 

The mass functions we find for these two cosmologies, as 
well as for additional simulations of the SCDM and OCDM 
cosmologies, display the kind of universality predicted by 
the Press-Schechter model. When expressed in suitable vari- 
ables, the mass function is independent of redshift, power 
spectrum shape, fl and A. This universality only obtains 
when we define halos in our simulations at fixed overden- 
sity independent of Q,. When we use the spherical collapse 
model to define the appropriate overdensity, as suggested 
by Lacey & Cole (1994) and Eke et al (1996), we find mass 
functions for low density cosmologies which vary weakly but 
systematically with redshift. As has been noted before, the 



Press- S 
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proposed by Sheth & Tormen (1999) is a very good fit to the 
universal mass function we find, and is close to the best fit 
we give as equation (9). As shown by Sheth, Mo & Tormen 
(1999) this shape is a plausible consequence of extending the 
excursion set derivation of the P-S model to include ellip- 
soidal collapse. 

Our "universal" mass function has considerable gener- 
ality since the simulations cover a wide range of parameter 
space: Q in the range 0.3 — 1, effective spectral power index 
in the range —1 to —2.5, inverse fluctuation amplitude in 
the range —1.2 < lncr -1 < 1.05. For standard cosmologies 
this corresponds approximately to the mass range 10 11 to 
10 16 h~ Y Mq at z = 0, and to the high mass tail of the mass 
function out to redshift 5 or more. More work is needed 
to check the abundances predicted for low mass halos, in 
particular to see whether all mass is predicted to be part of 
some halo, or whether some fraction makes up a truly diffuse 
medium. 

Data for many of the simulations analysed in this 
paper, as well as cluster and galaxy c atalogs created as 
part of other project s, are available from http://www.mpa- 



garching.mpg/ Virgo 



in all cosmologies. On the other hand, the fitting function 



Software to convert eqn. (H) into a mass function, for 
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Figure 9. The parameter range covered by the simulation out- 
puts listed in Table 2. The high mass end of the mass function is 
well determined for a range of values of the parameter n c ff , while 
the low mass end is only determined well for high values of n e g. 
Dashed lines indicate models with Qq < !• 



a given power spectrum is available on request from ARJ 
(A.R.Jenkins@durham.ac.uk). 
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APPENDIX A: A RESOLUTION STUDY 

To investigate the effects of mass resolution, gravitational 
softening and starting redshift, we carried out the three test 
simulations detailed at the bottom of Table 1. These simula- 
tions are all of an identically sized region, 200/i _1 Mpc on a 
side, and are set up so that corresponding waves have iden- 
tical phases and amplitudes in all three cases. Two of the 
simulations, rCDM-testl and rCDM-test2, have the same 
particle mass as the Hubble volume simulation. The former 
has essentially identical numerical parameters to the rCDM- 
hub simulation, while the latter differs in having a smaller 
gravitational softening length. The rCDM-test3 simulation 
has 8 times as many particles as rCDM-testl, but is other- 
wise identical. As a check of the effect of the starting red- 
shift we repeated rCDM-testl a second time but starting 
at redshift 14 rather than 29 (as for the other tests and 
rCDM-hub). 

Fig. Al shows the mass functions for halos found with 
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Figure Al. A resolution study of the effect of varying the particle 
mass on the mass function of FOF halos for three values of the 
linking parameter, b. The dashed lines show the mass functions 
obtained from rCDM-test3, and the solid lines from rCDM-testl. 
These simulations have the same phases but differ by a factor of 
8 in particle mass. The mass function is plotted for halos with 
20 particles and above. The filled circles show the corresponding 
mass function in rCDM-hub which has the same particle mass as 
rCDM-testl. Defined in this way, the mass function is remarkably 
robust. 



Figure A2. A resolution study of the effect of varying the particle 
mass on the mass function of SO halos. The dashed line shows the 
mass function obtained from rCDM-test3, and the solid line from 
rCDM-testl . These simulations have the same phases but differ 
by a factor of 8 in particle mass. The mass function is plotted 
for halos with 20 particles and above. The filled circles show the 
corresponding mass function in rCDM-hub which has the same 
particle mass as rCDM-testl. The dotted line shows the SO mass 
function when 1 in 8 particles are randomly sampled from rCDM- 
test3. This test shows that the SO mass function is sensitive to 
the particle mass, unlike the FOF mass function in Fig. Al. 



the FOF group-finder using three different linking lengths 
(6 = 0.1,0.15 and 0.2), for rCDM-testl and rCDM-test3. 
Consistent with the results of Section 3, the FOF mass func- 
tion is only very weakly dependent on the particle mass. 
Changing the softening makes an even smaller difference and 
we do not plot the curve for rCDM-test2; the average RMS 
difference between the mass functions with 30h~ kpc and 
100/i _1 kpc gravitational softening is just 0.0135 dex. If the 
softening were increased significantly beyond 100/i~ kpc, 
then we would expect the mass function to decrease as ha- 
los become more and more diffuse. The mass function for 
rCDM-hub is shown also and agrees well with rCDM-testl, 
as it should. We conclude that the mass function of FOF 
halos is remarkably robust to the numerical parameters in 
the simulations. 

When halos are identified using the SO group- finder, the 
situation is slightly different. As Fig. A2 shows, the SO mass 
function from rCDM-testl agrees well with that in rCDM- 
hub over the corresponding range of masses. Changing the 
softening to 30/i _1 kpc in rCDM-test2 does not change the 
mass function much. However, changing the mass resolution 
by a factor of 8, as in rCDM-test3, leads to a significant 
change in the mass function at the low mass end, which 
is now much closer to that determined from rCDM-virgo, 
a simulation with similar mass resolution. Resampling the 
high-mass resolution simulation, rCDM-test3, at random, 
at a rate of l-in-8 does not have any noticeable effect on the 
mass function. This suggests that the resolution-dependence 
of the mass function close to the resolution limit is not due 
simply to details of the cluster-finding algorithm (such as 



the position of the halo centre). Rather, it suggests that the 
difference may reflect genuine differences in the structural 
properties of marginally resolved halos in simulations with 
different particle numbers. 

The mass function is not very sensitive to the starting 
redshift unless this is so low that the neighbouring Zel'dovich 
displacements are so large as to interfere with the formation 
of the first non-linear objects. In practice, all the simulations 
compiled here pass this test easily. As a additional check, we 
repeated rCDM-testl, starting at z — 14. This made very 
little difference to the mass function overall, causing an RMS 
average difference over the measured mass function of only 
0.02 dex. 

In summary, our tests indicate that we can derive an 
accurate mass function for FOF groups from the Virgo sim- 
ulations (including the Hubble volume). The mass functions 
agree well in the overlap regions of simulations of different 
mass resolution even when halos with only 20 particles per 
group are included. Perhaps surprisingly, the mass functions 
of SO groups do not match up nearly as well at such low par- 
ticle numbers. A minimum of ~ 100 particles is required to 
provide reasonable agreement in the overlap regions in this 
case. 
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Figure A3. The residuals between the fitting formulae for the mass function given in appendix B and (i) the simulations (solid lines) 
and (ii) the Press-Schechter and Sheth-Tormen models (dashed lines.) The simulation curves are plotted up to the point where the 
fractional Poisson error reaches 10%. The simulations themselves can be seen to be mutually consistent at about the 10% level although 
the differences are, for the most part, larger than the Poisson errors. The P-S curve shows an excess at low masses and a deficit at high 
masses. The Sheth-Tormen mass function fits the low mass end well, but overestimates the number of high mass clusters. Note that we 
use natural logarithms in this plot. 



APPENDIX B: FITTING FORMULAE FOR 
THE FOF AND SO MASS FUNCTIONS 

Here, we give fitting formulae for the (unsmoothed) mass 
functions of FOF and SO halos, using the standard values 
of the group-finding parameters given in Section 2. These 
fits are plotted in Figs. 3 and 4 of Section 4. 

1. The friends-of-friends group finder. 
rCDM/FOF(0.2): 

f(M) = 0.307 exp [ - | In a" 1 + 0.61| 3 ' 82 ] , (Bl) 

in the range -0.9 < lncr" 1 < 1.0. 
ACDM/FOF (0.164): 

f(M) = 0.301 exp [ - | In a" 1 + 0.64| 3 ' 88 ] , (B2) 

in the range -0.96 < In a" 1 < 1.0. The difference between 
these fitting functions and the actual mass functions is typ- 
ically less than 10%. 



2. The spherical overdensity group finder: 
rCDM/SO(180): 

f(M) = 0.301 exp [ - | In a" 1 + 0.64| 382 ] , (B3) 

for the range —0.5 < lncr" 1 < 1.0. 
ACDM/SO(324): 

f(M) = 0.316 exp [ - | In cr" 1 + 0.67| 382 ] , (B4) 

for the range —0.7 < lncr" 1 < 1.0. 

As discussed earlier, the differences between the SO 
mass functions are larger than for FOF halos. Differences 
between the fit and the mass functions within the quoted 
mass range can be as large as 20%. 

Fig. Bl shows the residuals between the four fitting for- 
mulae quoted above and the mass functions in the simu- 
lations. Also shown are the differences between our fitting 
formulae and both the P-S and S-T models. The simulation 
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curves are truncated at the high mass end at the point where 
the fractional RMS Poisson error reaches 10%. The simula- 
tion mass functions may be seen to be consistent with one 
another at a level of about 10%. As seen in previous plots, 
the P-S curve is too high at low masses (low lna -1 ) and 
too low at high masses. For the fit proposed by Sheth & 
Tormen (1999, eqn. 10) good agreement is to be expected 
at the low mass end since a subset of the simulation data 
used here (the -gif simulations) was used by Sheth & Tor- 
men (1999). Although their mass function is normalised (so 
that all the mass is attached to halos), it does match our 
fits for rCDM-FOF rather well over their entire range. For 
the other models, the Sheth- Tormen fit overestimates the 
number of very massive clusters. Because of the normali- 
sation constraint, the Sheth- Tormen mass function exceeds 
the P-S predictions for extremely low mass halos but this 
occurs at a point well beyond what can be tested easily in 
N-body simulations. Whether such low-mass behaviour is 
correct remains to be determined. 
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